Establishment of cancer-associated fibroblasts-related subtypes and prognostic index for prostate cancer through single-cell and bulk RNA transcriptome

Current evidence indicate that cancer-associated fibroblasts (CAFs) play an important role in prostate cancer (PCa) development and progression. In this study, we identified CAF-related molecular subtypes and prognostic index for PCa patients undergoing radical prostatectomy through integrating single-cell and bulk RNA sequencing data. We completed analyses using software R 3.6.3 and its suitable packages. Through single-cell and bulk RNA sequencing analysis, NDRG2, TSPAN1, PTN, APOE, OR51E2, P4HB, STEAP1 and ABCC4 were used to construct molecular subtypes and CAF-related gene prognostic index (CRGPI). These genes could clearly divide the PCa patients into two subtypes in TCGA database and the BCR risk of subtype 1 was 13.27 times higher than that of subtype 2 with statistical significance. Similar results were observed in MSKCC2010 and GSE46602 cohorts. In addtion, the molucular subtypes were the independent risk factor of PCa patients. We orchestrated CRGPI based on the above genes and divided 430 PCa patients in TCGA database into high- and low- risk groups according to the median value of this score. We found that high-risk group had significant higher risk of BCR than low-risk group (HR: 5.45). For functional analysis, protein secretion was highly enriched in subtype 2 while snare interactions in vesicular transport was highly enriched in subtype 1. In terms of tumor heterogeneity and stemness, subtype 1 showd higher levels of TMB than subtype 2. In addition, subtype 1 had significant higher activated dendritic cell score than subtype 2. Based on eight CAF-related genes, we developed two prognostic subtypes and constructed a gene prognostic index, which could predict the prognosis of PCa patients very well.


Scientific Reports
| (2023) 13:9016 | https://doi.org/10.1038/s41598-023-36125-0 www.nature.com/scientificreports/ cells, endothelial cells, and myeloid-derived mesenchymal stem cells in tumor tissues can also differentiate into CAFs [18][19][20][21] . CAFs can secrete growth factors and cytokines to inhibit the function of immune cell, thereby promoting tumor progression, invasion and migration 18,20 . In addition, CAFs can synthesize and remodel the extracellular matrix, which forms the penetration barriers to prevent the penetration of the drug and the immune cells into the tumor tissue, thus reducing the effectiveness of the tumor therapy 19,21 . Several studies indicate that high density of CAFs is associated with poor BCR-free survival in PCa patients 23,24 . Therefore, an insight into the relationship between CAFs in PCa and tumor metastasis, proliferation, progression, and drug resistance is essential. In our study, we constructed CAF-related gene prognostic index (CRGPI) based on 8 CAF-related genes identified by single-cell and bulk RNA transcriptome. Based on the non-negative matrix factorization (NMF) algorithm and the 8 identified CAF-related genes, we divided these patients into two subgroups and validated its reliability in several external datasets, which could predict the prognosis of PCa patients and better guide clinical application in the future.

Methods
Data preparation. In terms of single-cell RNA sequencing level, we downloaded 453 markers related to CAFs from the tumor immunotherapy gene expression resource (TIGER) database (http:// tiger. cance romics. org/#/ singl eCell Immune 25 , which contains single-cell transcriptome data of 2,116,945 immune cells from 655 samples including PCa 26 . For bulk RNA sequencing level, we used the PCa gene matrix and clinical features in TCGA database from our pervious study 14 . CAF abundance was calculated using EPIC algorithm 27 owing to its higher correlation than other immune algorithms 23,28 and prognosis analysis was conducted. Weighted gene co-expression network analysis (WGCNA) was used to caculate the CAF-related genes in TCGA database. After intersection of CAF markers and CAF-related genes, we detected the final genes using Lasso regression analysis. Subsequently, we constructed risk score using coefficients in Lasso regression and TCGA subtypes using nonnegative matrix factorization (NMF) method based on these genes. CAFrelated gene prognostic index (CRGPI) = − 0.108359658017165*NDRG2-0.170784032731384*TSPAN1-0.0575930783435198*PTN + 0.084664542685909*APOE-0.0308048566779615*OR51E2-0.0450336-720010116*P4HB + 0.0392873624467431*STEAP1-0.0148044294523877*ABCC4. In TCGA database, 430 samples with complete BCR information were used, and the p value was smaller than 0.05 using log-rank test for BCR-free survival. Two cohorts were used externally validated the prognostic values of risk score and TCGA subtypes, including GSE46602 29 and MSKCC2010 30,31 . The clinical features of molecular subtypes were analyzed.
Mutaion landscape and functional diferences between two subtypes. RNA-sequencing profiles, genetic mutation and corresponding clinical information for PCa were downloaded from the TCGA database (https:// portal. gdc. com). The data of mutations were downloaded and visualized using the maftools package in R software. Differences of mutation frequency between two subtypes were also conducted. In terms of funcational analysis, gene set enrichment analysis (GSEA) was performed using "c2.cp.kegg.v7.4.symbols.gmt" and "h.all.v7.4.symbols.gmt" from the molecular signatures database 32,33 . Based on gene expression and subtypes, the minimum gene set was defined as 5 while the maximum gene set was 5000. Resampling was performed as 1000 times. P value of < 0.05 and a false discovery rate of < 0.05 were considered statistically significant.
Tumor immune microenvironmen (TME). The overall tumor microenvironment and immune components assessment were calculated by Cibersort and ESTIMATE algorithms [40][41][42] . The differences of 54 immune checkpoints and tumor microenvironment scores betwenn the two subytpes were analyzed by the Wilcoxon rank sum test. Figure 1 illustrates the study flowchart.
Statistical analysis. We completed analyses using software R 3.6.3 and its suitable packages. We exploited the Wilcoxon test in the context of abnormal distribution. Survival analysis was conducted through log-rank test and presented as Kaplan-Meier curve. Statistical significance was set as two-sided p < 0.05. Significant marks were as follows: not significance (ns), p ≥ 0.05; *p < 0.05; **p < 0.01; ***p < 0.001.

Results
Single cell and bulk RNA sequencing identified CAF-related markers and constructed TCGA subtypes and CRGPI. Using EPIC algorithm, we found that CAF was significantly associated with BCR-free survival in 430 PCa patients in TCGA database ( Fig. 2A). Using WGCNA, three modules were found (Fig. 2B) where grey module genes was highly associated with CAF (Fig. 2C). In terms of single cell RNA sequencing data, Fig. 2D showed good quality control and tumor tissue consisted of eight types of cells, including fibroblasts ( Fig. 2E), whose markers could be seen in Fig. 2F. Subsequently, we extracted 453 CAF markers using the defini- www.nature.com/scientificreports/ tion of absolute value of logFC ≥ 0.4 and p value < 0.05 ( Fig. 2D-F). In TCGA database, 277 genes were related to CAF using WGCNA. 73 genes were obtained from intersection of CAF-related genes and CAF markers (Fig. 2G) and these genes were involved in focal adhesion, proteoglycans in cancer and TCG-beta signaling pathway (Fig. 2H). Through Lasso regression analysis, when lambda (λ) equals 0.03, we obtained the optimal model ( Fig. 2I) and used NDRG2, TSPAN1, PTN, APOE, OR51E2, P4HB, STEAP1 and ABCC4 for subsequent analysis (Fig. 2J). These genes could clearly divide the PCa patients into two subtypes in TCGA database (Fig. 3A) and the BCR risk of subtype 1 was 13.27 times higher than that of subtype 2 with statistical significance (Fig. 3B). Similar results were observed in MSKCC2010 (Fig. 3C,D) and GSE46602 cohorts (Fig. 3E,F). The baseline comparsion showed balanced clinical features between subtype 1 and subtype 2 ( Table 1). In addtion, the molucular subtypes were the independent risk factor of PCa patients (Table 2). We orchestrated CRGPI based on the above genes and divided 430 PCa patients in TCGA database into high-and low-risk groups according to the median value of this score. We found that high-risk group had significant higher risk of BCR than low-risk group (HR: 5.45; Fig. 3G). Other two cohorts showed similar results (Fig. 3H,I). The top ten genes between subtype 1 and subtype 2 were NYNRIN, PTCHD4, WNK1, CNNM4, ARFGEF1, HRAS, PYHIN1, ARHGEF2, MYOM1 and ITGB6 with statistical significance (Fig. 3J).
Functional enrichment, TME evaluation and tumor heterogeneity and stemness. For functional analysis, protein secretion was highly enriched in subtype 2 ( Fig. 4A) while snare interactions in vesicular transport was highly enriched in subtype 1 (Fig. 4B). In terms of tumor heterogeneity and stemness, subtype 1 showd higher levels of TMB than subtype 2 (Fig. 4C). The expression levels of ICOS, CTLA4 and TNFRSF8 were   www.nature.com/scientificreports/ significantly higher in subtype 1 than those in subtype 2 (Fig. 4D). In addition, subtype 1 had significant higher activated dendritic cell score than subtype 2 (Fig. 4D).

Discussion
CAFs have been shown to promote tumor growth and progression in a variety of ways, such as secreting extracellular matrix proteins, inducing inflammation and neovascularization, increasing angiogenesis and constructing the immunosuppressive TME 18,43 . In PCa, CAFs are the most abundant cells in the stroma of TME 14 and play an important role in the development and invasion of PCa 44,45 . For instance, CAFs can secrete IL-6 to enable AR transcriptional activity in PCa cells by modulating MAPK,STAT3,and PI3K/AKT signaling, thereby inducing resistance to anti-androgen therapies 44,46,47 . CAFs could mediate a series of genes such as NF-κB-dependent expression of the Wnt family member WNT16B to promote EMT in PCa cells, which induced resistance to cytotoxic agents 48 . Moreover, the role of CAFs in PCa bone metastasis has also been reported that CAFs may promote PCa bone metastasis by fibronectin and collagen deposition and establishing protein interaction network 49 . The important role of CAFs in tumorigenesis and development makes us aware that target CAFs will be an attractive breakthrough in anti-cancer therapy. In our study, we identified eight CAF-related genes by combining with single-cell and bulk RNA transcriptome, including STEAP1, APOE, OR51E2, PTN, ABCC4, TSPAN1, P4HB and NDRG2. Studies on the role of STEAP1, OR51E2, PTN, ABCC4, NDRG2 have been relatively in-depth. For instance, STEAP1 is all-called six-transmembrane epithelial antigen of the prostate 1, which belongs to a family of metalloproteinases involved in iron and copper homeostasis and other cellular processes 37 . STEAP1 is overexpressed on the plasma membrane of PCa cells and is associated with PCa invasiveness and metastasis 50 . www.nature.com/scientificreports/ The possible mechanism of STEAP1 promoting tumor proliferation and metastasis is by acting as a channel for small molecules that are involved in intercellular communication 50,51 . Similarly,OR51E2 is a kind of olfactory receptor, which is also called prostate specific G-protein receptor 2, and is highly expressed in PCa 52 . Some studies reported that OR51E2 perhaps could be used as one of the biomarkers for PCa [53][54][55][56] . Moreover, many researchers have shown that the activation of OR51E2 can inhibit proliferation of PCa cells and induce their invasion [57][58][59][60] . PTN encodes pleiotrophin, which is a secreted growth factor involved in angiogenesis and tumor growth 61 .A recent study showed that serum pleiotrophin levels were increased in the high-risk group compared with benign and low-risk PCa patients 62 , which is consistent with our previous study 63 . The protein encoded by ABCC4 is a member of the superfamily of ATP-binding cassette (ABC) transporters, which is also called multidrug resistance protein 4 (MRP4) 64 . MRP4 was reported to be associated with drug resistance of PCa in many studies [65][66][67] . NDRG2 is a tumor suppressor gene that suppresses tumorigenesis and metastasis and increases the sensitivity to anticancer drugs 68 . Several researches have reported that the overexpression of NDRG2 could suppress the invasion and metastasis of PCa cells 69,70 . Some researchers found that low level of NDRG2 were associated with radioresistance of PCa Cells and overexpression of NDRG2 in combination with radiotherapy might be an effective therapeutic method for PCa 71,72 . In summary, the function of these five genes in PCa is relatively determined. There are very few reports of TSPAN1 and P4HB in PCa. The overexpression of TSPAN1 is widely regarded as with EMT, tumor proliferation and migration, tumor growth which has been demonstrated in multiple cancer types [73][74][75][76][77] . In PCa, TSPAN1 was induced by androgens and upregulated in PCa tissues. TSPAN1 involved in controlling the expression of key proteins of cell migration to promote invasion and metastasis of PCa 78 . In addition, another study found that the knockdown of TSPAN1 in PCa cell could inhibit cell proliferation and migration 79 . However, the detailed mechanisms of regulating the proliferation and metastasis of PCa cells remain incompletely elucidated and require further investigation. There is only one study on the role of P4HB in PCa. Hu et al. constructed an autophagy-related gene signature containing P4HB and other genes in PCa through bioinformatic analysis 80 . But there are many studies on the role of P4HB in other diseases, such as bladder carcinoma, oesophageal cancer and kidney renal clear cell carcinoma [81][82][83] . There seems to be some debates about the role of APOE in PCa. A study showed that different apolipoprotein E genotypes may have different risks of developing PCa 84 . However, another study showed apolipoprotein E genotype was not associated with PCa risk 85 . Accurate conclusions may need to be supported by more experimental evidence. In summary, based on the eight identified genes, we divided 430 PCa patients into 2 subtypes. We observed the prognosis of subgroup 1 was significantly worse than that of subgroup 2. To better reveal the potential mechanisms of the two prognostic subtypes, we performed functional analysis, gene mutation, tumor heterogeneity and stemness, and TME assessment. Our study found protein secretion was highly enriched in subtype 2 while snare interactions in vesicular transport was highly enriched in subtype 1.Snare is all-called soluble N-ethylmaleimide-sensitive factor attachment protein receptors, which plays an important role in tumor invasion, chemo-resistance, autophagy, apoptosis, and phosphorylation of kinases essential for cancer cell biogenesis 86 . Snare can promote close proximity between vesicles and cell membrane to form a fusion pore after contact, which can mediate the transport of material between vesicles and cells 86,87 . Therefore, the enrichment of snare interactions in vesicular transport pathway in cluster 1 may indicate a higher level of proliferation and invasion demand for PCa cells in cluster 1, correlating with a worse prognosis. In contrast, we observed that protein secretion was mainly enriched in subtype 2 and we speculated it may be related to the increasing androgen signaling. Compared with hormone-naive metastatic PCa, hormone-refractory metastatic PCa showed a marked decrease of androgen signaling and protein biosynthesis, which possibly meant the decrease of androgen signaling and protein biosynthesis was responsible for PCa progression and higher malignancy of tumor cells 88 . This is agreement with our study showing patients in subtype 2 have better prognosis. Additionally, the difference of TMB between subtype 1 and subtype 2 may also explain why the prognosis of subtype 1 is much worse than subtype 2. We observed TMB of subtype 1 was significantly higher than subtype 2. TMB refers to the total number of mutations present in a single tumor specimen, which can be used to predict the tumor response to immunotherapy in a variety of tumors 89 . Patients with high TMB usually mean that tumor cells carry more tumor antigens on their surface, which are therefore vulnerable to being killed by activated immune cells 89,90 . In PCa, high TMB level was significantly associated with older age, positive lymph node, higher international Society of Urological Pathology (ISUP) grade, advanced stage and poor BCR-free survival 91 . In metastatic castration-resistant PCa, ICIs were more effective than taxanes for patients with high TMB(10 mt/Mb or greater) 92 . These two studies showed PCa patients with higher TMB were associated with bad prognosis and better effect of immunotherapy, which was consistent with our study and may explain the poor prognosis of subgroup 1 to some extent. Furthermore, we compared the differences in the common immune checkpoints between subtype 1 and 2 and found the expression of ICOS, CTLA4 and TNFRSF8 were significantly elevated in subtype 1.It is well known that a significant mechanism of tumor cells evading immunosurveillance is activation of immune checkpoint pathways, which can suppress antitumor immune response 93 . CTLA-4 has been widely studied in tumors. High expression of CTLA-4 was observed to be associated with a worse prognosis in many cancers, including PCa, and the intervention of CTLA-4 blockers could improve prognosis of patients [94][95][96][97][98] . High expression of CTLA-4 can inhibit T cell activation by competing with CD28, regulating the inhibitory function of Treg cells and controlling adhesion and motility, which in turn leads to immunosuppression of tumors. Similarly, ICOS is an activating costimulatory immune checkpoint expressed on activated T cells, which participate in regulating T cell activation and adaptive immune responses. Mo L et al. adopt a combination approach of ICOS positive Treg cells depletion with tumor cell vaccine in mouse PCa model and found ICOS blocking could deplete the tumor-infiltrated ICOS positive Treg cells 99 . These findings may well explain our results, because high expression of CTLA-4 and ICOS may mean stronger immunosuppression, which then leads to a worse prognosis of the subtype 1 patients. Moreover,TNFRSF8 is an important therapeutic target for the treatment of malignant lymphomas, but there are few studies on TNFRSF8  100 . Only a recent study showed higher plasma TNFRSF8 were associated with shorter PFS in patients received androgen deprivation therapy plus cabozantinib 101 . More evidence is needed regarding whether the high expression of TNFRSF8 is associated with poor prognosis. Intriguingly, we observed in the comparison of dendritic cells activated, subtype 1 was lower than subtype 2. The median difference between the two groups was 0 (0-0.002), and the difference was statistically significant. There are already many evidences that dendritic cells plays an important role in antitumor immunity 102 . Dendritic cells can capture tumor antigens that are released from tumor cells and present them to T cells, eventually resulting in the generation of cytotoxic T lymphocytes (CTLs) 103 . This may indicate that with increasing dendritic cells activation, the antigen presentation function is increasing, and then the number and function of CTLs is increasing, ultimately leading to a stronger antitumor immune response. In addition, the success of Sipuleucel-T (a dendritic cell based immunotherapy) in prostate tumor therapy reflects the great value of immunotherapy targeting dendritic cells 104 . In PCa, there have been a phase 3 trial showing that Sipuleucel-T significantly improve overall survival in patients with metastatic castration-resistant PCa 105 .These studies are consistent with our results and reveal the important role of dendritic cells activated in the prognosis of patients with PCa.

Conclusions
Based on eight CAF-related genes, we developed two prognostic subtypes and constructed a gene prognostic index, which could predict the prognosis of PCa patients very well. Meanwhile, our study also provides a valuable resource for understanding the underlying mechanisms of PCa progression, and provides valuable insights into the management of the BCR in PCa.